Ontology from Local Hierarchical Structure in Text

نویسندگان

  • Fionn Murtagh
  • Josiane Mothe
  • Kurt Englmeier
چکیده

We study the notion of hierarchy in the context of visualizing textual data and navigating text collections. A formal framework for " hierarchy " is given by an ultrametric topology. This provides us with a theoretical foundation for concept hierarchy creation. A major objective is scalable annotation or labeling of concept maps. Serendipitously we pursue other objectives such as deriving common word pair (and triplet) phrases, i.e., word 2-and 3-grams. We evaluate our approach using (i) a collection of texts, (ii) a single text subdivided into successive parts (for which we provide an interactive demonstrator), and (iii) a text subdivided at the sentence or line level. While detailing a generic framework, a distinguishing feature of our work is that we focus on locality of hierarchic structure in order to extract semantic information.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners

The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...

متن کامل

خوشه‌بندی اسناد مبتنی بر آنتولوژی و رویکرد فازی

Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...

متن کامل

Symbolic Dynamics in Text: Application to Automated Construction of Concept Hierarchies

Following a symbolic encoding of selected terms used in text, we determine symmetries that are furnished by local hierarchical structure. We develop this study so that hierarchical fragments can be used in a concept hierarchy, or ontology. By “letting the data speak” in this way, we avoid the arbitrariness of approximately fitting a model to the data.

متن کامل

Automatic Document Topic Identification Using Hierarchical Ontology Extracted from Human Background Knowledge

The rapid growth in the number of documents available to end users from around the world has led to a greatly-increased need for machine understanding of their topics, as well as for automatic grouping of related documents. This constitutes one of the main current challenges in text mining. In this work, a novel technique is proposed, to automatically construct a background knowledge structure ...

متن کامل

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

متن کامل

Ontology Building using Parallel Enumerative Structures

The semantics of a text is carried by both the natural language it contains and its layout. As ontology building processes have so far taken only plain text into consideration, our aim is to elicit its textual structure. We focus here on parallel enumerative structures because they bear implicit or explicit hierarchical relations, they have salient visual properties, and they are frequently fou...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/cs/0701180  شماره 

صفحات  -

تاریخ انتشار 2007